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A matching algorithm for the identification of backbones in percolation problems is 
introduced. Using this procedure, percolation backbones are studied in two- to five- 
dimensional systems containing 1.7 X 1CF sites, two orders of magnitude larger than 
was previously possible using burning algorithms. 



1. Introduction 



In percolation models 0, the sites or bonds of a lattice are independently oc- 
cupied with probability p, and the properties of the resulting clusters arc studied. 
Above a critical density p c , a cluster that spans the whole system exists, and one 
says that connectivity percolates. The backbone is defined as the subset of a con- 
nected cluster that carries current when a potential difference is applied between 
two points. The backbone of the spanning cluster determines the macroscopic trans- 
port properties of the system. At the critical concentration p c , connected clusters 
and backbones are known to be fractals, although with different fractal dimensions^ 
While geometrical properties of connected clusters have been numerically studied □ 
on very large systems containing as much as 10 11 sites, backbone studies cl have been 
reduced to relatively small lattices due to the lack of efficient integer algorithms for 
backbone identification. The burning algorithm has been used by several authors, 
but this procedure is strongly CPU-limited and therefore it has not been possible 
to study systems of more than 2.5 x 10 5 sites. 

Matching algorithmshave been recently introduced Qi3'El for the related problem 
of rigidity percolation LrU, and I show here that an extension of these methods can 
be applied to the study of connectivity percolation, allowing very efficient backbone 
identification for systems much larger than was previously possible using the burning 
algorithm. 



2. The matching algorithm 

Consider a system of n sites i — l,...,n connected by an arbitrary set E of 
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Figure 1: a) Before adding a new edge (ih) (dashed line) the following test is done: 
b) arrow (ki) is inverted, uncovering i. c) we uncover h by inverting the entire path 
(hgfe). Since both ends could be simultaneously uncovered, the new edge does not 
close a loop and therefore d) it is definitively added to V. 

b bonds ij connecting sites i and j. Bonds arc initially labelled from 1 to b, and 
sites are labelled from 6+1 to b + n. The matching algorithm can be thought of 
as a clever way to identify and remove loops, and is implemented using a directed 
graph T> as an auxiliary representation of the system. In this directed graph, lattice 
sites are represented by nodes i and bonds by directed edges ij. We may think of 
each edge as an arrow. These can be pointing in either direction, subject to the 
constraint that no node has more than one arrow pointing to it. A node pointed to 
by an arrow will be said to be covered by the corresponding edge. A node with no 
incoming arrows is on the other hand uncovered. The directed graph V will be kept 
looplcss. In order to do this, each time a closed loop (a cycle, or circuit) is identified 
(see later) it will be removed from D, and replaced by a loop-node. A loop-node 
is a node in V that represents a deleted loop. Therefore, although initially there 
are as many nodes in V as there are sites in our system, as the algorithm proceeds 
we will delete some nodes and create loop-nodes. These loop-nodes will be given a 
loop-label, which is the minimum of all edge- and node-labels in the loop. 

We start from a graph V initially containing n nodes and no edges, and add 
edges one at a time. Before adding an edge ab G E to £>, the following test is done 
in order to know if a loop is closed by ab: We attempt to reorganise the existing 
arrows (if any) on P, in order to uncover both a and b simultaneously. Since by 
hypothesis V without edge ab has no loops, it must always be possible to uncover 
one of them. Let us then assume that a is uncovered first. If after doing this b can 
also be uncovered (while keeping a uncovered), then the new edge ab does not close 
a loop, and therefore it is definitively added to T>. This means: we add an arrow 
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Figure 2: a) We attempt to add a new edge (je) (dashed) to T>. b) First arrow 
(ij) is inverted, uncovering j. But now there is no way to uncover e while keeping j 
uncovered. This means that (ej) closes a loop. So now: c) start from e and follow 
arrows backwards, in this way identifying all edges and nodes in the loop (bold 
lines), d) The loop is removed from T> and replaced by a loop- node m. 

between a and b, pointing to either of them (Fig. [l]). 

If on the other hand it is not possible to uncover b while keeping a uncovered 
(Fig. ||), this necessarily means that a loop would be formed on V by the addition 
of ab. In this case, edge ab is not added to V but the following is done instead: 
Starting from b (covered), we follow the arrows backwards. This will necessarily 
lead to a, thus identifying the new loop. All edges in this loop are given a common 
label l m in, which is the minimum amongst all node- and edge-labels in the loop 
(including nodes a and b and edge ab). Next all nodes and edges in the loop are 
removed from T> and replaced by a node with label l m in- 

There is an exception though, as eventually some of the nodes in this loop are 
connected to other nodes outside the loop (for example nodes i, f and g in Fig. ||). 
These are not removed from T> but connected instead to the newly created node 
(node m in Fig. ^ by auxiliary arrows that initially point outwards. These auxiliary 
arrows have the same label as the loop-node. Also, some of the nodes that we do 
remove (nodes j, g and h in Fig. |J) may be eventually needed at later stages, for 
example if we have to add a new edge that connects to one of them. In such a case 
we simply recreate the corresponding node and connect it by means of an auxiliary 
arrow to the loop node. 

The process of replacing identified loops by nodes is called condensation u. It is 
possible to implement the matching algorithm without condensation. In this case, 
one simply does not add edges that close a loop but simply keeps track of the fact 
that a new loop has been formed by giving all loop-edges a common label. The 
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Figure 3: A spanning cluster on a site-diluted square lattice of linear dimension 
L = 64 at the critical point. Bus bars are located on the left and right ends of the 
sample. Boundary conditions are periodic in the vertical direction. The removal 
of any critical bond (thick gray lines) would produce disconnection. In addition 
to those, 'blobs' of multiply connected bonds (thick black lines) also belong to the 
backbone. Dangling ends (thin lines) are connected to the backbone at one point 
only and are not relevant for macroscopic conductivity. 

algorithm is perhaps simpler in this way. Condensation has on the other hand the 
effect of enormously improving speed and reducing memory requirements, since the 
graph size is decreased each time a loop is found. This is extremely important for 
large systems U. 

3. Backbone Identification 

Imagine we want to know if two far apart points a and b on the system are 
connected, and in such case we want to identify the backbone B(a, b) between these 
two points. This can be done by noticing that, if a and b are connected, then a 
long-range bond ab would close a loop. Thus our method above serves to identify 
the parts of the system that belong to this loop, i.e. the backbone B(a, b). 

Therefore at any point we may simply do as if we were to connect a fictitious 
edge ab between two arbitrary points a and b in the form described above, that is: 
we attempt to uncover both a and b on T>. If this is possible, then a and b are not 
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connected. End. If only a but not b can be uncovered, starting from b follow the 
arrows backwards and in this way the backbone between a and b is identified. The 
visited edges are all critical, i.e. cutting one of the would produce disconnection of 
(a, b). These are called red bonds in the percolation literature. Some of the visited 
nodes will be loop-nodes. These are also included in the backbone, i.e. all edges 
that form the loops are, as well as all loop-nodes in them, in a recursive manner. 
Loop nodes are what is called blobs in the percolation literature. They are multiply 
connected so they contain no red bonds. If after identifying the backbone one is 
interested in also identifying the dangling ends, this can be done by 'testing' a fic- 
titious bond between the backbone and all other nodes in the system. If a node is 
connected to the backbone then one cannot uncover both ends simultaneously. Thus 
the whole spanning cluster can be identified in this way (Fig. ||). But let us stress 
that dangling-end identification is not one of the strengths of the matching algo- 
rithm. This would be more efficiently done using for example a Hoshen-Kopelmann 
type algorithm El. 

This is just a brief sketch of the main ideas about the matching algorithm. 
Several important issues have not been discussed in this paper, as for example 
how exactly nodes are uncovered in an efficient manner, data structures needed for 
computational speed, as well as a precise discussion of the relevant properties of the 
algorithm. A more detailed description will be published elsewhere liil 

4. Results 

We present here partial results for scalar percolation on site-diluted hypercubic 
lattices in 2 to 5 dimensions. We start with a system containing no sites and add 
them one at a time at random locations. Each time a site is occupied, all new 
induced edges are tested and added to V one at a time. Induced edges are those 
connecting the new site to already occupied nearest-neighbours. Loops are identified 
and removed as described in the previous sections. We proceed in this way until 
the percolation point is reached, which is detected because of the existence of a 
fictitious long-range bond connecting opposite sides of the sample. At this point, 
the density of occupied sites gives p c for this sample. We identify and measure the 
whole backbone (red bonds plus blobs) as well as the dangling ends at p c for each 
sample in this manner. 

4.1. CPU time 

Figure |^ shows CPU times per sample in seconds, needed to identify the whole 
backbone as well as the spanning cluster at p c versus linear sample size. Approx- 
imately half of that time is needed to identify the backbone alone, and the rest is 
used to obtain the dangling ends. Largest sizes simulated were L — 4096 in 2D, 
L = 256 in 3D, L = 60 in 4D and L = 26 in 5D. The number of samples varies 
between some thousands for the largest sizes to some millions for the smaller ones 
in each dimension. All runs were done on Alpha-500 workstations. Fits of the data 
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Figure 4: CPU time t, in seconds per sample versus linear size L, needed to identify 
the whole spanning cluster of scalar connectivity at p c . Identifying the backbone 
alone would take half of that approximately. The time-complexity exponent 9 in 
t ~ n e — L de is approximately one in all cases. 

in Fig. ^ show that CPU-time scales with system size n as t ~ n e with 9 = 1.07 
(2D), 1.06 (3D), 1.05 (4D) and 1.05 (5D). 

4.2. Correlation length exponent v 

Since we add sites one at a time until the percolation point is reached, we are 
able to measure, for each sample, the critical density p c of occupied sites. The 
fluctuation <tl =< Pc >l ~ < Pc >\ of this quantity, were <>l indicates averages 
overs samples of size L, is expected to scale as ol ~ L~ x l v with system size □. Here 
v is the correlation length exponent. On the other hand, rigorous arguments due 
to Coniglio i show that the number Rl of red bonds grows at p c as L Y I V . 

Thus measuring er^ and Rl , two independent estimates for the thermal exponent 
1/v are obtained. Figure || shows plots of — log(oL)/ log(L) and log(i?L)/ log(i) 
versus l/log(L). We fit these data using 

■j-,Ri~ L 1 /" (1 + aL-") (1) 

where oj is a corrections-to-scaling exponent. This allows the following estimates: 
1/v = 0.75 ± 0.01 (2D), 1.13 ± 0.02 (3D), 1.44 ± 0.05 (4D) and 1.63 ± 0.05 (5D). 
Correction to scaling exponents were found in most cases to be between 0.5 and 
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Figure 5: Fits of — log(cTL)/ log(L) and log(i?i)/ log(L) versus l/log(i) for scalar 
percolation in 2 to 5 dimensions. The intercept at l/log(i) = is the estimated 
value of 1/v. 

0.8, but precise values cannot be given for these lattice sizes. 

4.3. Backbone density at p c 

The fraction B{L) of bonds on the backbone at p c is measured for several system 
sizes L in each dimension. Our results are shown in Fig. ^|. 
Now assume B(L) to behave as 

B(L) = XL-P'^il + aL-") (2) 

where db = d — (3' /is is the backbone fractal dimension, and u> is an exponent of 
corrections to scaling. Fitting this expression with 4 free parameters to our data, we 
obtain the following estimates for the backbone fractal dimension: dj, = 1.650±0.005 
(2D), 1.86 ±0.01 (3D), 1.95 ±0.05 (4D) and 2.00 ± 0.05 (5D). 
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Figure 6: Backbone densities at p c for scalar percolation in 2 to 5 dimensions. Lines 
correspond to best fits using (g). 
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